 dropout layer


Regularizing Deep Neural Networks by Noise: Its Interpretation and Optimization

Hyeonwoo Noh, Tackgeun You, Jonghwan Mun, Bohyung Han

Neural Information Processing Systems

Injecting noises to hidden units during training, e.g., dropout, is known as a successful regularizer, but it is still not clear enough why such training techniques work well in practice and how we can maximize their benefit in the presence of two conflicting objectives--optimizing to true data distribution and preventing


Uncertainty Quantification In Surface Landmines and UXO Classification Using MC Dropout

Lekhak, Sagar, Ientilucci, Emmett J., Dera, Dimah, Ghosh, Susmita

arXiv.org Artificial Intelligence

Detecting surface landmines and unexploded ordnances (UXOs) using deep learning has shown promise in humanitarian demining. However, deterministic neural networks can be vulnerable to noisy conditions and adversarial attacks, leading to missed detection or misclassification. This study introduces the idea of uncertainty quantification through Monte Carlo (MC) Dropout, integrated into a fine-tuned ResNet-50 architecture for surface landmine and UXO classification, which was tested on a simulated dataset. Integrating the MC Dropout approach helps quantify epistemic uncertainty, providing an additional metric for prediction reliability, which could be helpful to make more informed decisions in demining operations. Experimental results on clean, adversarially perturbed, and noisy test images demonstrate the model's ability to flag unreliable predictions under challenging conditions. This proof-of-concept study highlights the need for uncertainty quantification in demining, raises awareness about the vulnerability of existing neural networks in demining to adversarial threats, and emphasizes the importance of developing more robust and reliable models for practical applications.
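As a rough illustration of how MC Dropout can yield a reliability metric, the sketch below aggregates the class probabilities from several stochastic forward passes. The abstract does not say which uncertainty measure the authors use; the predictive entropy and mutual information computed here are common choices and are an assumption, as are the function names.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mc_dropout_uncertainty(samples):
    """Summarise T stochastic forward passes.

    samples: list of T probability vectors (one softmax output per pass,
    each obtained with dropout left active at test time).
    Returns (mean_probs, predictive_entropy, mutual_information).
    """
    t = len(samples)
    k = len(samples[0])
    # Average the class probabilities over the T passes.
    mean_probs = [sum(s[c] for s in samples) / t for c in range(k)]
    # Entropy of the averaged prediction: total uncertainty.
    total = entropy(mean_probs)
    # Average entropy of the individual passes.
    expected = sum(entropy(s) for s in samples) / t
    # The gap between the two is often read as epistemic uncertainty.
    return mean_probs, total, total - expected
```

When all passes agree, the mutual information is zero; when passes disagree confidently, it grows, which is the kind of signal that could be used to flag a prediction as unreliable.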


Arabic Hate Speech Identification and Masking in Social Media using Deep Learning Models and Pre-trained Models Fine-tuning

Doghmash, Salam Thabet, Saad, Motaz

arXiv.org Artificial Intelligence

Hate speech identification in social media has become an increasingly important issue in recent years. In this research, we address two problems: 1) detecting hate speech in Arabic text, and 2) cleaning a given text of hate speech. Cleaning here means replacing each bad word with stars, one star per letter of the word. Regarding the first problem, we conduct several experiments using deep learning models and transformers to determine the best model in terms of the F1 score. Regarding the second problem, we treat it as a machine translation task, where the input is a sentence containing dirty text and the output is the same sentence with the dirty text masked. The best hate speech detection model achieves a 92% Macro F1 score and 95% accuracy. In the text cleaning experiment, the best hate speech masking model reached a 1-gram BLEU score of 0.3, which is a good result compared with state-of-the-art machine translation systems.


From Ground to Air: Noise Robustness in Vision Transformers and CNNs for Event-Based Vehicle Classification with Potential UAV Applications

Almesafri, Nouf, Figueiredo, Hector, Arana-Catania, Miguel

arXiv.org Artificial Intelligence

This study investigates the performance of the two most relevant computer vision deep learning architectures, Convolutional Neural Network and Vision Transformer, for event-based cameras. These cameras capture scene changes, unlike traditional frame-based cameras, which capture static images, and are particularly suited for dynamic environments such as UAVs and autonomous vehicles. The deep learning models studied in this work are ResNet34 and ViT B16, fine-tuned on the GEN1 event-based dataset. The research evaluates and compares these models under both standard conditions and in the presence of simulated noise. Initial evaluations on the clean GEN1 dataset reveal that ResNet34 and ViT B16 achieve accuracies of 88% and 86%, respectively, with ResNet34 showing a slight advantage in classification accuracy. However, the ViT B16 model demonstrates notable robustness, particularly given its pre-training on a smaller dataset. Although this study focuses on ground-based vehicle classification, the methodologies and findings hold significant promise for adaptation to UAV contexts, including aerial object classification and event-based vision systems for aviation-related tasks.


Protein Language Model Zero-Shot Fitness Predictions are Improved by Inference-only Dropout

Ravuri, Aditya, Lawrence, Neil D.

arXiv.org Artificial Intelligence

Protein Language Models (PLMs) such as ESM2 (Lin et al., 2023) have been shown to be capable of zero-shot prediction of critical scalar properties of proteins ("fitness", Meier et al. (2021)). In this work, we show that injecting a dropout layer at inference time between a PLM's featurizer/embedding layer and its transformer, and averaging its output akin to Monte-Carlo dropout (Gal & Ghahramani, 2016) increases zero-shot performance on a subset of the ProteinGym dataset (Notin et al., 2023). This is the case even when the model was not trained with dropouts to begin with, and does not require retraining or finetuning of the PLM. A dropout of 0.1 seems performant across all models.
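The inference-only procedure described above can be sketched in a few lines. This is a toy illustration, not the authors' implementation: `score_fn` is a hypothetical stand-in for the PLM's transformer plus scoring head, `embed` stands in for the embedding-layer output, and the number of samples is an arbitrary choice. Only the dropout rate of 0.1 comes from the abstract.

```python
import random


def dropout(vec, p, rng):
    """Inverted dropout: zero each entry with probability p and
    rescale the survivors by 1 / (1 - p). Requires 0 <= p < 1."""
    keep = 1.0 - p
    return [x / keep if rng.random() < keep else 0.0 for x in vec]


def mc_dropout_score(embed, score_fn, p=0.1, n_samples=32, seed=0):
    """Average score_fn over n_samples dropout-perturbed copies of embed.

    embed:    embedding-layer output (a flat list of floats here).
    score_fn: hypothetical callable mapping an embedding to a scalar score.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Dropout is applied between the embedding and the rest of the model.
        total += score_fn(dropout(embed, p, rng))
    return total / n_samples
```

No weights are updated at any point; the only change relative to ordinary inference is the random mask on the embeddings and the averaging of the resulting scores.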


Investigating Active Sampling for Hardness Classification with Vision-Based Tactile Sensors

Chen, Junyi, Kshirsagar, Alap, Heller, Frederik, Andreu, Mario Gómez, Belousov, Boris, Schneider, Tim, Lin, Lisa P. Y., Doerschner, Katja, Drewing, Knut, Peters, Jan

arXiv.org Artificial Intelligence

One of the most important object properties that humans and robots perceive through touch is hardness. This paper investigates information-theoretic active sampling strategies for sample-efficient hardness classification with vision-based tactile sensors. We evaluate three probabilistic classifier models and two model-uncertainty-based sampling strategies on a robotic setup as well as on a previously published dataset of samples collected by human testers. Our findings indicate that the active sampling approaches, driven by uncertainty metrics, surpass a random sampling baseline in terms of accuracy and stability. Additionally, while in our human study, the participants achieve an average accuracy of 48.00%

I. INTRODUCTION

Robots are increasingly being utilized in a variety of fields, from manufacturing to healthcare, where they interact with objects in their environment and plan their actions based on sensory feedback. A significant challenge in robotics is accurately perceiving object properties. This work focuses on a crucial property perceived through touch: hardness. Specifically, we investigate active sampling strategies for rapid hardness classification with a Vision-Based Tactile Sensor (VBTS). VBTSs like GelSight Mini [1] or FingerVision [2] provide a cost-effective and high-resolution alternative to traditional tactile sensors and also allow leveraging advancements in camera technology and computer vision.


A Machine Learning Approach For Bitcoin Forecasting

Sossi-Rojas, Stefano, Velarde, Gissel, Zieba, Damian

arXiv.org Artificial Intelligence

Bitcoin is one of the cryptocurrencies that has gained popularity in recent years. Previous studies have shown that closing price alone is not enough to forecast stock market series. We introduce a new set of time series and demonstrate that a subset is necessary to improve directional accuracy based on a machine learning ensemble. In our experiments, we study which time series and machine learning algorithms deliver the best results. We found that the most relevant time series that contribute to improving directional accuracy are Open, High and Low, with the largest contribution of Low in combination with an ensemble of Gated Recurrent Unit network and a baseline forecast. The relevance of other Bitcoin-related features that are not price-related is negligible. The proposed method delivers similar performance to the state-of-the-art when observing directional accuracy.


A Multidisciplinary Approach to Telegram Data Analysis

Varbanov, Velizar, Kopanov, Kalin, Atanasova, Tatiana

arXiv.org Artificial Intelligence

This paper presents a multidisciplinary approach to analyzing data from Telegram for early warning information regarding cyber threats. With the proliferation of hacktivist groups utilizing Telegram to disseminate information regarding future cyberattacks or to boast about successful ones, the need for effective data analysis methods is paramount. The primary challenge lies in the vast number of channels and the overwhelming volume of data, necessitating advanced techniques for discerning pertinent risks amidst the noise. To address this challenge, we employ a combination of neural network architectures and traditional machine learning algorithms. These methods are utilized to classify and identify potential cyber threats within the Telegram data. Additionally, sentiment analysis and entity recognition techniques are incorporated to provide deeper insights into the nature and context of the communicated information. The study evaluates the effectiveness of each method in detecting and categorizing cyber threats, comparing their performance and identifying areas for improvement. By leveraging these diverse analytical tools, we aim to enhance early warning systems for cyber threats, enabling more proactive responses to potential security breaches. This research contributes to the ongoing efforts to bolster cybersecurity measures in an increasingly interconnected digital landscape.


Adaptive Dropout for Pruning Conformers

Kubo, Yotaro, Cai, Xingyu, Bacchiani, Michiel

arXiv.org Artificial Intelligence

This paper proposes a method to effectively perform joint training-and-pruning based on adaptive dropout layers with unit-wise retention probabilities. The proposed method is based on the estimation of a unit-wise retention probability in a dropout layer. A unit that is estimated to have a small retention probability can be considered to be prunable. The retention probability of the unit is estimated using back-propagation and the Gumbel-Softmax technique. This pruning method is applied at several application points in Conformers such that the effective number of parameters can be significantly reduced. Specifically, adaptive dropout layers are introduced in three locations in each Conformer block: (a) the hidden layer of the feed-forward-net component, (b) the query vectors and the value vectors of the self-attention component, and (c) the input vectors of the LConv component. The proposed method is evaluated by conducting a speech recognition experiment on the LibriSpeech task. It was shown that this approach could simultaneously achieve a parameter reduction and accuracy improvement. The word error rates improved by approximately 1% while reducing the number of parameters by 54%.
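To make the idea of a learnable unit-wise retention probability more concrete, here is a minimal sketch of a two-class Gumbel-Softmax (relaxed Bernoulli) keep gate, followed by a simple pruning rule. It is not the paper's implementation: the function names, the temperature parameter, and the pruning threshold of 0.05 are all illustrative assumptions, and a real system would compute the gate in an autodiff framework so that gradients reach the retention probabilities.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def relaxed_keep_gate(retain_prob, temperature, rng):
    """Soft keep/drop gate in (0, 1) for one unit.

    For two classes (keep, drop), the Gumbel-Softmax sample for "keep"
    reduces to a sigmoid of the retention logit plus logistic noise
    (the difference of two Gumbel variables), divided by the temperature.
    Requires 0 < retain_prob < 1 and temperature > 0.
    """
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    logistic_noise = math.log(u) - math.log(1.0 - u)
    logit = math.log(retain_prob) - math.log(1.0 - retain_prob)
    return sigmoid((logit + logistic_noise) / temperature)


def prunable_units(retain_probs, threshold=0.05):
    """Indices of units whose retention probability is below a
    hypothetical threshold and could therefore be pruned."""
    return [i for i, p in enumerate(retain_probs) if p < threshold]
```

A gate above 0.5 corresponds to keeping the unit, and this happens with probability equal to `retain_prob`, so the relaxation behaves like ordinary dropout in the low-temperature limit while remaining differentiable with respect to the retention probability.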


Turbofan Engine Remaining Useful Life (RUL) Prediction Based on Bi-Directional Long Short-Term Memory (BLSTM)

Sherifi, Abedin

arXiv.org Artificial Intelligence

The aviation industry is rapidly evolving, driven by advancements in technology. Turbofan engines used in commercial aerospace are very complex systems. The majority of turbofan engine components are susceptible to degradation over the life of their operation. Turbofan engine degradation has an impact on engine performance, operability, and reliability. Accurately predicting the remaining useful life (RUL) of a commercial turbofan engine based on a variety of complex sensor data is of paramount importance for the safety of the passengers, safety of flight, and for cost-effective operations. That is why it is essential for turbofan engines to be monitored, controlled, and maintained. RUL predictions can come from either model-based or data-based approaches. The model-based approach can be very expensive due to the complexity of the mathematical models and the deep expertise that is required in the domain of physical systems. The data-based approach is more frequently used nowadays thanks to the high computational capability of computers, the advancements in Machine Learning (ML) models, and advancements in sensors. This paper focuses on Bi-Directional Long Short-Term Memory (BLSTM) models but also provides a benchmark of several data-based RUL prediction models. The proposed RUL prediction models are evaluated on the engine failure prediction benchmark dataset Commercial Modular Aero-Propulsion System Simulation (CMAPSS). The CMAPSS dataset, from NASA, contains turbofan engine run-to-failure events.